Quick Start
This guide will help you get started with the LW-BenchHub for kitchen manipulation tasks. The framework supports multiple robots, teleoperation data collection, trajectory replay, and reinforcement learning training.
Prerequisites
Before starting, ensure you have properly installed the lw_benchhub environment following the installation instructions. You should have:
- NVIDIA Isaac Lab properly configured
- All required dependencies installed via install.sh
- GPU drivers and CUDA properly set up
Teleoperation Data Collection
LW-BenchHub provides a comprehensive teleoperation system for collecting demonstration data in simulation kitchen environments. You can control various robots using different input devices to perform manipulation tasks.
Quick Start (choose a configuration)
Run the teleoperation interface by specifying a config preset name (e.g., pandaomron or g1-hand):
python ./lw_benchhub/scripts/teleop/teleop_main.py --task_config g1-hand
# or
python ./lw_benchhub/scripts/teleop/teleop_main.py --task_config pandaomron
Tip: If you prefer using teleop.sh, edit the script to set task_config to a valid preset.
Configuration Overview (teleop_base.yml)
The teleoperation system is highly configurable. Here's the default configuration structure:
# =========================
# Basic Settings
# =========================
disable_fabric: false
num_envs: 1
device: cpu
sensitivity: 1.0
step_hz: 50
# =========================
# Robot Settings
# =========================
robot: PandaOmron-Rel
robot_scale: 1.0
# =========================
# Task & Scene Settings
# =========================
task: RobocasaBaseTask
scene_backend: robocasa
task_backend: robocasa
debug_assets: null
layout: robocasakitchen
sources:
- objaverse
- lightwheel
- aigen_objs
object_projects: []
usd_simplify: false
seed: 0
# =========================
# Data Collection Settings
# =========================
record: false
dataset_file: ./datasets/dataset.hdf5
num_demos: 1
enable_debug_log: false
continue_teleop_after_success: false
# =========================
# Teleoperation Device Settings
# =========================
enable_pinocchio: false
teleop_device: keyboard
relative_control: false
first_person_view: false
enable_rendering_optimization: true
enable_multiple_viewports: false
action_delay_async: false
action_delay_time_s: 0
action_delay_type: fixed
action_buffer_size: 100
# =========================
# Video Saving Settings
# =========================
enable_cameras: false
save_video: false
video_save_dir: ./videos
video_fps: 30
# =========================
# Object Placement Retry Settings
# =========================
max_scene_retry: 4
max_object_placement_retry: 3
resample_objects_placement_on_reset: true
resample_robot_placement_on_reset: true
Customizing Your Configuration
You can create your own personal configuration in the configs directory (the config file name must be globally unique across the entire project). Simply add the following line at the top of your new config file:
_base_: teleop_base
This inherits all settings from teleop_base.yml, and you can freely override any of them. For example, you can override parameters such as robot, layout, record, num_demos, or teleop_device to customize the scene, robot, control device, and data collection to your requirements.
If you want to inherit multiple configuration files, simply modify the inheritance syntax as follows:
_base_: [teleop_base, config_A, config_B, ...]
Variable override order follows the list: the later a config appears, the higher its priority.
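For example, a minimal personal preset might look like the following (the file name and overridden values are illustrative; any key from teleop_base can be overridden):
# configs/my_teleop.yml (hypothetical name)
_base_: teleop_base
teleop_device: dualhandtracking_abs  # pick any input device listed below
record: true                         # enable data collection
num_demos: 5
dataset_file: ./datasets/my_dataset.hdf5
Assuming presets are referenced by file name, this preset would then be launched with python ./lw_benchhub/scripts/teleop/teleop_main.py --task_config my_teleop.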
Available Input Devices
- keyboard: Default input method with intuitive key mappings
- dualhandtracking_abs: Two-hand tracking with retargeting (recommended for G1-Hand)
- vr-controller: Supports standard VR controllers (PICO and QUEST), with full button and joystick control for robot teleoperation
- vr-hand: Integrates PICO/QUEST hand tracking to control the robot
- so101leader: SO101 master-slave arm control
- bi_so101leader: Dual-arm SO101 master-slave control
- For dualhandtracking_abs, control is performed with a Vision Pro. For Vision Pro configuration and usage, refer to the Isaac Lab documentation: https://isaac-sim.github.io/IsaacLab/main/source/how-to/cloudxr_teleoperation.html#cloudxr-teleoperation
- For vr-controller and vr-hand, we use Open-TeleVision to connect PICO/QUEST devices. For instructions on using Open-TeleVision, see https://github.com/OpenTeleVision/TeleVision
- Controller quick guide: use the trigger to open/close the gripper, the joystick to move the base, and the grip button to toggle the base movement mode (rotate in place or translate). By default, the right controller operates the robot; for certain robots (e.g., dual-arm robots or the X7s waist), use the left controller's trigger or joystick instead.
Trajectory Replay
After collecting teleoperation data, you can replay and analyze the trajectories using two different replay modes. This is essential for validating data quality and understanding robot behavior.
State-Based Replay
State replay reproduces the exact state of all objects in the scene, including robot poses, object positions, and articulation states. This provides perfect reproduction of the recorded session.
python ./lw_benchhub/scripts/teleop/replay_demos.py --dataset_file "/path/to/your/dataset.hdf5" --enable_cameras
Use cases:
- Analyzing manipulation strategies
- Debugging data collection issues
- Creating reference videos
- Understanding object interactions
Action-Based Replay
Action replay tests the robot's ability to reproduce behaviors by replaying only the actions, allowing the simulation to determine the resulting states. This mode tests the robustness of the recorded behaviors.
Joint Target Replay
Replays the computed joint targets, allowing the robot's control system to track these targets:
python ./lw_benchhub/scripts/teleop/replay_action_demo.py \
--dataset_file /path/to/your/dataset.hdf5 \
--replay_mode joint_target \
--enable_cameras
Features:
- Tests joint-level control directly, bypassing inverse kinematics
- Validates simulation environment consistency and stability
- Evaluates precise reproduction of joint trajectories
Action Replay
Replays the high-level actions through the full control pipeline (e.g., inverse kinematics + PD control):
python ./lw_benchhub/scripts/teleop/replay_action_demo.py \
--dataset_file /path/to/your/dataset.hdf5 \
--replay_mode action \
--enable_cameras
Features:
- End-to-end action validation
- Tests complete control pipeline
- Evaluates task completion success
Replay Analysis
Both replay modes provide valuable insights:
- Trajectory smoothness: Assess motion quality
- Task success rate: Validate demonstration effectiveness
- Control stability: Identify potential control issues
- Data quality: Ensure consistent robot behavior
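For offline inspection of a recorded dataset, a short script like the sketch below can report per-demo length and a crude smoothness proxy. It assumes a robomimic-style HDF5 layout (demonstrations under a data group, each with an actions dataset); adjust the group and dataset names to match your actual dataset_file.
import h5py
import numpy as np

def summarize_demos(dataset_path):
    """Print per-demo length and mean per-step action change as a rough smoothness proxy."""
    with h5py.File(dataset_path, "r") as f:
        data = f["data"]  # assumed top-level demo group
        for demo_name in data.keys():
            actions = np.asarray(data[demo_name]["actions"])  # assumed per-demo action dataset
            # Large mean action deltas hint at jerky teleoperation; small values suggest smooth motion.
            mean_delta = np.linalg.norm(np.diff(actions, axis=0), axis=1).mean()
            print(f"{demo_name}: {len(actions)} steps, mean action delta = {mean_delta:.4f}")

summarize_demos("./datasets/dataset.hdf5")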
Reinforcement Learning
LW-BenchHub provides a complete RL pipeline integrated with SKRL. An RL run is defined by a YAML preset under configs/rl/skrl/ (e.g., g1_liftobj.yaml) that selects a robot, task, and scene. The environment itself is composed from a robot RL variant and a task RL config registered in lw_benchhub_rl/__init__.py.
Training a Policy
Launch RL training with the default preset:
bash train.sh # default preset uses LiftObj (state variant)
The script sets GPU variables internally. To use a different preset/variant, edit train.sh or run the Python entry directly:
python ./lw_benchhub/scripts/rl/train.py \
--task_config lerobot_liftobj_state \
--headless
Advanced Training Options
Use your own preset under configs/rl/skrl/ (e.g., my_robot_my_task.yaml):
_base_: rl_base
task: MyTask
robot: MyRLRobot
layout: AnyLayout
rl: MyRLConfig
num_envs: 128
usd_simplify: true # Enable USD simplification
enable_cameras: false
Then train:
python ./lw_benchhub/scripts/rl/train.py --task_config my_robot_my_task --headless
RL configuration parameters
Below is a brief guide to the main parameters used by the RL presets and the agent configuration.
Environment preset (rl_base.yml)
remote_protocal: ipc # restful or ipc
for_rl: true
concatenate_terms: false
disable_fabric: false # Disable fabric and use USD I/O operations
device: cuda:0
# Environment Configuration
num_envs: 10 # Number of environments to simulate
layout: robocasakitchen # Scene layout
sources: # Object source name
- objaverse
- lightwheel
- aigen_objs
object_projects: [] # Object project name
scene_backend: robocasa # Scene backend name
task_backend: robocasa # Task backend name
debug_assets: null # Debug assets name
robot: null # Robot cfg name
task: null # Task cfg name
rl: null # RL cfg name
variant: null # RL variant name, defined when registering the RL config
usd_simplify: false # Enable USD simplification
enable_rendering_optimization: true # Enable optimization
# Asset and Export Paths
export_base_path: null # Export base path (replace with actual ASSET_PATH if defined in code)
robot_scale: 1.0 # Robot scale
first_person_view: false # Enable first-person view
# Reinforcement Learning Settings
video: false # Record videos during training
video_length: 200 # Length of recorded video (in steps)
video_interval: 2000 # Interval between video recordings (in steps)
seed: null # Seed used for the environment
distributed: false # Run training with multiple GPUs or nodes
max_iterations: null # Max RL policy training iterations
# ML Framework Settings
ml_framework: "torch" # The ML framework used for training
# Options: ["torch", "jax", "jax-numpy"]
algorithm: "PPO" # The RL algorithm used for training
# Options: ["AMP", "PPO", "IPPO", "MAPPO"]
# Checkpoint
checkpoint: null
# Play Settings
use_pretrained_checkpoint: false
real_time: false
enable_cameras: false
check_success: true
- Effective fabric usage is computed as use_fabric = not disable_fabric in the code.
- num_envs trades off throughput vs. GPU memory.
- algorithm selects which SKRL config key is used internally.
Task preset (e.g., g1_liftobj_state.yaml)
_base_: rl_base
task: LiftObj # Task name registered in lw_benchhub_rl
robot: G1-RL # Robot RL variant (action heads / EE frames)
layout: robocasakitchen-1-8 # Kitchen scene layout
rl: G1LiftObjStateRL # RL Config
num_envs: 1024 # Override of rl_base default
usd_simplify: true # Override for faster loading
enable_cameras: false # State variant does not require cameras (set true for Visual)
- The final environment is composed from the RL config registered in lw_benchhub_rl/__init__.py.
- You can override any key from rl_base.yml in the task preset.
Agent configuration (SKRL) — example fields
models:
separate: false # Single shared backbone for policy/value (false) or separate nets (true)
policy:
class: GaussianMixin # Stochastic Gaussian policy (continuous actions)
clip_log_std: true # Clamp log_std to [min_log_std, max_log_std]
min_log_std: -20.0
max_log_std: 2.0
initial_log_std: 0.0
network:
- name: net
input: STATES # Input tensor spec (auto: STATES/ACTIONS)
layers: [512, 256, 128, 64]
activations: elu
output: ACTIONS
value:
class: DeterministicMixin # Scalar value head
network:
- name: net
input: STATES
layers: [512, 256, 128, 64]
activations: elu
output: ONE
memory:
class: RandomMemory # Rollout buffer backend
memory_size: -1 # Auto = agent.rollouts
agent:
class: PPO
rollouts: 24 # Steps per environment before update (episode fragments)
learning_epochs: 8 # Optimization epochs per update
mini_batches: 4 # Gradient batches per epoch
discount_factor: 0.99 # Gamma
lambda: 0.95 # GAE lambda
learning_rate: 1.0e-04
learning_rate_scheduler: KLAdaptiveLR
learning_rate_scheduler_kwargs:
kl_threshold: 0.01 # Target KL for LR adaptation
state_preprocessor: RunningStandardScaler
value_preprocessor: RunningStandardScaler
random_timesteps: 0 # Warmup steps with random actions
learning_starts: 0 # Start updates after N steps
grad_norm_clip: 1.0
ratio_clip: 0.2 # PPO epsilon clip
value_clip: 0.2
clip_predicted_values: true
entropy_loss_scale: 0.001 # Entropy bonus
value_loss_scale: 2.0
kl_threshold: 0.0 # Additional KL early stop (if used)
rewards_shaper_scale: 0.01 # Global reward scaling factor
time_limit_bootstrap: false # Bootstrap value at time limit
experiment:
directory: "lift_obj" # Log root under policy/skrl/logs/
experiment_name: "" # Optional suffix
write_interval: auto
checkpoint_interval: auto
trainer:
class: SequentialTrainer
timesteps: 72000 # Total environment steps (num_envs * steps)
environment_info: log
- Effective batch size per update roughly scales with rollouts * num_envs (and sequence length if recurrent).
- Increase entropy_loss_scale to encourage exploration; lower it for exploitation.
- Use learning_rate_scheduler with kl_threshold to stabilize training automatically.
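As a quick sanity check of these knobs, here is the arithmetic for the example values above (rollouts: 24, num_envs: 1024, mini_batches: 4):
rollouts, num_envs, mini_batches = 24, 1024, 4
samples_per_update = rollouts * num_envs             # 24576 transitions collected before each PPO update
minibatch_size = samples_per_update // mini_batches  # 6144 samples per gradient step, repeated each learning epoch
print(samples_per_update, minibatch_size)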
Additional RL Configuration (rl field) - Example
@rl_on(task=task_A)
@rl_on(task=task_B)
@rl_on(embodiment=robot_A)
@rl_on(embodiment=robot_B)
class TaskRobotRL(LwRL):
    # Example RL configuration for reward and observation tailored to a series of tasks (task_A, task_B, ...)
    # and a series of robots (robot_A, robot_B, ...).
    def __init__(self):
        super().__init__()
        self.policy_observation_cfg = TaskRobotPolicyObsCfg()
        self.rewards_cfg = RewardsCfg()
        self.events_cfg = EventCfg()

    def setup_env_config(self, orchestrator):
        # Set the specific reward/observation configuration and the link and eef names for your RL task.
        pass

    def modify_env_cfg(self, env_cfg):
        # Set special environment configuration variables, such as the end-effector (ee) frame cfg.
        pass
GPU Configuration
The training script automatically handles GPU allocation:
- Single GPU: Uses one GPU for both environment and policy
- Multi-GPU: Distributes environment and policy across different GPUs
- CPU fallback: Supports CPU-only training for development
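The allocation logic itself lives in the training script; as a minimal illustration of the CPU-fallback idea only (not the actual implementation), device selection can be as simple as:
import torch

# Fall back to CPU when no GPU is visible, e.g. for quick development runs.
device = "cuda:0" if torch.cuda.is_available() else "cpu"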
Policy Evaluation
Test your trained policies using the evaluation script:
bash eval.sh # uses task_config=lerobot_liftobj_state_play by default
Or run custom evaluation:
python ./lw_benchhub/scripts/rl/play.py \
--task_config lerobot_liftobj_state_play \
# --enable_cameras \
# --headless
Training Monitoring
- Tensorboard: Monitor training progress and metrics
- Video logging: Automatic recording of policy performance
- Checkpoint saving: Regular model checkpointing for recovery
- Evaluation episodes: Periodic policy evaluation during training
Next Steps
Once you're familiar with the basic workflow, explore these advanced features:
- Custom Tasks: Create new manipulation tasks by extending the base task classes
- Robot Integration: Add support for new robot configurations
- Multi-Modal Learning: Combine teleoperation data with RL training
- Scene Customization: Design custom kitchen layouts and scenarios
- Algorithm Integration: Implement new RL algorithms using the SKRL framework
For detailed information about each component, refer to the comprehensive documentation and API references.
Troubleshooting
Common Issues
- GPU Memory: Reduce num_envs if you run into memory issues
- Control Instability: Adjust the sensitivity and step_hz parameters
- Rendering Issues: Check the Isaac Lab installation and graphics drivers
- Data Collection: Ensure record: true is set and dataset_file points to a valid path
Performance Optimization
- Use --headless for faster training without visualization
- Enable --enable_cameras only when needed for vision-based policies
- Tune step_hz based on your hardware capabilities
- Use the appropriate device setting (cuda/cpu) for your setup
This framework provides a solid foundation for kitchen manipulation research, supporting everything from basic teleoperation to advanced reinforcement learning with multiple robot platforms.